Speech-Driven Facial Animation Using a Shared Gaussian Process Latent Variable Model

Authors

  • Salil Deena
  • Aphrodite Galata
Abstract

This work synthesises facial animation by modelling the mapping between facial motion and speech with a shared Gaussian process latent variable model (GPLVM). The audio and visual data are processed separately and then coupled to yield a shared latent space. Placing a dynamical model on the latent space allows coarticulation to be modelled. Novel animation is synthesised by first obtaining intermediate latent points from the audio data and then using a Gaussian process mapping to predict the corresponding visual data. In a statistical evaluation against ground-truth data, the generated visual features compare favourably with known speech animation methods. The generated videos show proper synchronisation with the audio and exhibit correct facial dynamics.
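The two-stage synthesis described in the abstract (audio to latent points, then latent points to visual features) can be illustrated with a small sketch. This is not the authors' implementation: the latent points here are synthetic and given in advance rather than learned by optimising a shared GPLVM, no dynamical model is included, and the audio-to-latent step is assumed to be a plain Gaussian process regression. The names `rbf_kernel`, `gp_predict_mean` and `synthesise`, the feature dimensions, and all hyperparameters are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def gp_predict_mean(x_train, y_train, x_test, lengthscale=1.0, noise=1e-4):
    """Posterior mean of a zero-mean GP regression evaluated at x_test."""
    k = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train, lengthscale)
    return k_star @ np.linalg.solve(k, y_train)


rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): a 2-D latent trajectory that generates
# 5-D "audio" features and 8-D "visual" features through smooth maps.
t = np.linspace(0.0, 2.0 * np.pi, 60)
latent = np.stack([np.cos(t), np.sin(t)], axis=1)
audio = np.tanh(latent @ rng.normal(size=(2, 5)))
visual = np.sin(latent @ rng.normal(size=(2, 8)))

# Even frames are used for training, odd frames are held out.
train = np.arange(0, 60, 2)
test = np.arange(1, 60, 2)


def synthesise(audio_new):
    """Stage 1: audio -> latent points. Stage 2: latent points -> visual features."""
    latent_hat = gp_predict_mean(audio[train], latent[train], audio_new)
    return gp_predict_mean(latent[train], visual[train], latent_hat)


visual_hat = synthesise(audio[test])
rmse = float(np.sqrt(np.mean((visual_hat - visual[test]) ** 2)))
print("held-out RMSE of predicted visual features:", rmse)
```

In a shared GPLVM the latent coordinates would be optimised jointly so that both observation spaces are generated from them; the sketch fixes them only to keep the prediction path short and readable.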

Similar resources

Shared Gaussian Process Latent Variable Model for Multi-view Facial Expression Recognition

Facial-expression data often appear in multiple views either due to head-movements or the camera position. Existing methods for multi-view facial expression recognition perform classification of the target expressions either by using classifiers learned separately for each view or by using a single classifier learned for all views. However, these approaches do not explore the fact that multi-vi...

Stylized synthesis of facial speech motions

Stylized synthesis of facial speech motions is central to facial animation. Most synthesis algorithms put emphasis on the reasonable concatenation of captured motion segments. The dynamic modeling of speech units, e.g. visemes and visyllables (the visual appearance of a syllable), has not drawn much attention. In this paper, we address the fundamental issues regarding the stylized dynamic model...

Repurposing hand animation for interactive applications

In this paper we describe a method for automatically animating interactive characters based on an existing corpus of key-framed hand-animation. The method learns separate low-dimensional embeddings for subsets of the hand-animation corresponding to different semantic labels. These embeddings use the Gaussian Process Latent Variable Model to map high-dimensional rig control parameters to a three...

End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech

We present a deep learning framework for realtime speech-driven 3D facial animation from just raw waveforms. Our deep neural network directly maps an input sequence of speech audio to a series of micro facial action unit activations and head rotations to drive a 3D blendshape face model. In particular, our deep model is able to learn the latent representations of time-varying contextual informa...

Using GPLVM for Inverse Kinematics on Non-cyclic Data

We apply the Gaussian Process Latent Variable Model (GPLVM) to tackle the inverse kinematic problem in character animation during a ball catching scenario. The goal is to generate realistic upper-body movements with only the tip of the hand specified as the constraint. We ran a series of motion capture experiments to capture the body movement in a subject as he performs a ball catching task and...

Publication date: 2009